Deep Learning Policy Quantization

نویسندگان

Jos van de Wolfshaar

Marco Wiering

Lambert Schomaker

چکیده

We introduce a novel type of actor-critic approach for deep reinforcement learning which is based on learning vector quantization. We replace the softmax operator of the policy with a more general and more flexible operator that is similar to the robust soft learning vector quantization algorithm. We compare our approach to the default A3C architecture on three Atari 2600 games and a simplistic game called Catch. We show that the proposed algorithm outperforms the softmax architecture on Catch. On the Atari games, we observe a nonunanimous pattern in terms of the best performing model.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

P-V-L Deep: A Big Data Analytics Solution for Now-casting in Monetary Policy

The development of new technologies has confronted the entire domain of science and industry with issues of big data's scalability as well as its integration with the purpose of forecasting analytics in its life cycle. In predictive analytics, the forecast of near-future and recent past - or in other words, the now-casting - is the continuous study of real-time events and constantly updated whe...

متن کامل

Deep Reinforcement Learning of Video Games

The ability to learn is arguably the most crucial aspect of human intelligence. In reinforcement learning, we attempt to formalize a certain type of learning that is based on rewards and penalties. These supervisory signals should guide an agent to learn optimal behavior. In particular, this research focuses on deep reinforcement learning, where the agent should learn to play video games solely...

متن کامل

Accurate Deep Representation Quantization with Gradient Snapping Layer for Similarity Search

Recent advance of large scale similarity search involves using deeply learned representations to improve the search accuracy and use vector quantization methods to increase the search speed. However, how to learn deep representations that strongly preserve similarities between data pairs and can be accurately quantized via vector quantization remains a challenging task. Existing methods simply ...

متن کامل

Competitive Reinforcement Learning in Continuous Control Tasks

This paper describes a novel hybrid reinforcement learning algorithm, Sarsa Learning Vector Quantization (SLVQ), that leaves the reinforcement part intact but employs a more effective representation of the policy function using a piecewise constant function based upon “policy prototypes.” The prototypes correspond to the pattern classes induced by the Voronoi tessellation generated by self-orga...

متن کامل

Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding

Deep Compression is a three stage compression pipeline: pruning, quantization and Huffman coding. Pruning reduces the number of weights by 10x, quantization further improves the compression rate between 27x and 31x. Huffman coding gives more compression: between 35x and 49x. The compression rate already included the metadata for sparse representation. Deep Compression doesn’t incur loss of accu...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2018

Deep Learning Policy Quantization

نویسندگان

چکیده

منابع مشابه

P-V-L Deep: A Big Data Analytics Solution for Now-casting in Monetary Policy

Deep Reinforcement Learning of Video Games

Accurate Deep Representation Quantization with Gradient Snapping Layer for Similarity Search

Competitive Reinforcement Learning in Continuous Control Tasks

Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding

عنوان ژورنال:

اشتراک گذاری